Evaluating and correcting phoneme segmentation for unit selection synthesis
Abstract
As part of improved support for building unit selection voices, the Festival speech synthesis system now includes two algorithms for automatic labeling of wavefile data. The two methods are based on dynamic time warping and HMM-based acoustic modeling. Our experiments show that DTW is more accurate 70% of the time, but is also more prone to gross labeling errors. HMM modeling exhibits a systematic bias of 15 ms. Combining both methods directs human labelers towards data most likely to be problematic.
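The abstract does not say how the two labelings are combined. One plausible reading, sketched below purely as an illustration, is to compare the boundary that each method assigns to the same phoneme and to flag the boundaries where they disagree strongly, largest disagreement first. The function name `flag_disagreements`, the 20 ms threshold, and the integer-millisecond boundary lists are all assumptions made for this sketch; none of them come from Festival or from the paper.

```python
from typing import List, Tuple


def flag_disagreements(
    dtw_ms: List[int],
    hmm_ms: List[int],
    threshold_ms: int = 20,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[int, int]]:
    """Return (boundary_index, disagreement_ms) pairs for manual review.

    Both inputs hold one boundary time per phoneme, in milliseconds,
    for the same utterance and the same phoneme sequence.
    """
    if len(dtw_ms) != len(hmm_ms):
        raise ValueError("both labelings must cover the same phoneme sequence")

    flagged = []
    for index, (dtw_time, hmm_time) in enumerate(zip(dtw_ms, hmm_ms)):
        disagreement = abs(dtw_time - hmm_time)
        if disagreement > threshold_ms:
            flagged.append((index, disagreement))

    # Largest disagreements first, so a labeler sees the worst cases early.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged


if __name__ == "__main__":
    dtw_boundaries = [100, 250, 400, 620, 800]
    hmm_boundaries = [110, 260, 470, 605, 770]
    print(flag_disagreements(dtw_boundaries, hmm_boundaries))
```

A real pipeline would probably subtract the reported systematic HMM bias before comparing, and would tune the threshold on hand-labeled data; both steps are left out here to keep the sketch short.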
Similar resources
Syllable Specific Unit Selection Cost Function Using a Tone Modeling Technique for Automatic Phonetic Segmentation of Hindi Speech Using HMM
This paper presents a technique for improving tone correctness in speech synthesis of a tonal language based on an average-voice model trained with a corpus of speech from nonprofessional speakers. Unit selection-based concatenative synthesis is one of the most widely used speech synthesis approaches. This approach overcomes the limitations of other synthesis techniques such as articulatory synthesis an...
Automatic error detection in alignments for speech synthesis
The phonetic segmentation of recorded speech is a crucial factor in the quality of concatenative systems for speech synthesis. We describe a likelihood-based error detection process that can be used to flag possible errors in such a segmentation, with a view towards manual correction. It is shown that this process can be used to assist in the creation of high-accuracy segmentations. In partic...
Automatic Phoneme Segmentation with Relaxed Textual Constraints
Speech synthesis by unit selection requires the segmentation of a large single speaker high quality recording. Automatic speech recognition techniques, e.g. Hidden Markov Models (HMM), can be optimised for maximum segmentation accuracy. This paper presents the results of tuning such a phoneme segmentation system. Firstly, using no text transcription, the design of an HMM phoneme recogniser is o...
Fully automatic segmentation for prosodic speech corpora
While automatic methods for phonetic segmentation of speech can help with rapid annotation of corpora, most methods rely either on manually segmented data to initially train the process or manual post-processing. This is very time-consuming and slows down porting of speech systems to new languages. In the context of prosody corpora for text-to-speech (TTS) systems, we investigated methods for f...
Automatic Speech Segmentation Based on HMM
This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications where a large amount of data must be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech...